1 Introduction

Lagrangian relaxation and dual decomposition are extremely effective in solving large-scale convex
optimization problems [1-6]. Dual decomposition has also been employed successfully in the field of
distributed convex optimization, where the optimization problem needs to be decomposed among
cooperative computing entities (hereafter simply called nodes). In this case, the optimization
is generally divided into two steps: a first step pertaining to the calculation of the local
subgradients of the Lagrangian dual function, and a second step consisting of the global update of the
dual variables by projected subgradient ascent. The first step can typically be performed in parallel
on the nodes, whereas the second step often has to be performed centrally, by a so-called master node
(or data-gathering node, or fusion center), which combines the local subgradient information.

Even though by solving the dual problem, one obtains a lower bound on the optimal value of the 
original convex problem, in practical situations one would also like to have access to an approximate 
primal solution. However, even with the availability of an approximate dual optimal solution, a 
primal one cannot be easily obtained. The reason is that the Lagrangian dual function is generally 
nonsmooth at an optimal point, thus an optimal primal solution is not a trivial combination of 
the extreme subproblem solutions. Methods to recover approximate (near-optimal) primal solutions 
from the information coming from dual decomposition have been proposed in the past [4, 7-13] (and 
references therein). In one way or another, all these methods use a combination of all the approximate 
primal solutions that are generated while the dual decomposition scheme converges to a near-optimal 
dual solution. A possible choice for the combination is the ergodic mean [4, 11, 14]. 

Among the dual decomposition schemes with primal recovery mechanism available in the literature,
we are interested here in the ones that employ a constant stepsize in the projected dual subgradient
update. The reasons are twofold. First of all, a constant stepsize yields faster convergence
to a bounded error floor, which is fundamental in real-time applications (e.g., control of networked
systems). In addition, the error floor can be tuned by trading off the number of iterations required
against the value of the stepsize. The second reason is that in many situations the underlying convex
optimization problem is not stationary, but changes over time. Having in mind the development of
methods to update the dual variables while the optimization problem varies [15-17], it is of key
importance to employ a constant stepsize. In this way, the capability of the subgradient scheme to
track the dual optimal solutions does not degrade over time, as it would with a vanishing stepsize.

In this paper, we propose a way to remove the need for a master node to collect the local subgradient
information coming from the different nodes and generate a global subgradient. The reason is that
in distributed systems, the nodes are connected via an ad-hoc network and the communication is often
limited to geographically nearby nodes. It is therefore impractical to collect the local subgradient
information in one physical location, whereas it is advisable to enable the nodes themselves to have
access to a suitable approximation of the global subgradient. We use consensus-based mechanisms
to construct such an approximation. Consensus-based mechanisms have been used in the primal domain
both with constant stepsizes [18, 19] and with vanishing ones [19-21]; however, to the best of
the authors' knowledge, they have not been used in the dual domain, and not together with primal
recovery. An interesting, but different, approach applying consensus on the cutting-plane algorithm
to solve the master problem has been very recently proposed in [22]. Our main contributions can be
described as follows.

First, we develop a constant stepsize consensus-based dual decomposition. Our method enables
the different nodes to generate a sequence of approximate dual optimal solutions whose dual cost
eventually converges to the optimal dual cost within a bounded error floor. Under the assumptions
of convexity, compactness of the feasible set, and Slater's condition, the convergence goes as $O(1/k)$,
where $k$ is the number of iterations. The error depends on the stepsize and on the number of consensus
steps between subsequent iterations $k$. Furthermore, the nodes exchange subgradient
information only with their neighboring nodes.

Then, since in our method each node maintains its own approximate dual sequence, we provide
an upper bound on the disagreement among the nodes, and we prove its convergence to a bounded
value.

Finally, we propose a primal recovery scheme to generate approximate primal solutions from
consensus-based dual decomposition. This scheme is proven to converge to the optimal primal cost
up to a bounded error floor. Once again, under the same assumptions, the convergence goes as $O(1/k)$
and the error depends on the stepsize and on the number of consensus steps.

Organization. Section 2 describes the problem setting, our main research question, and some sample 
applications. In Section 3, we cover the basics of dual decomposition to pinpoint the main limitation, 
i.e., the need for a master node. We propose, develop, and investigate the convergence results of our 
algorithm in Sections 4 and 5. All the proofs are contained in Sections 6 and 7. In Section 8, we collect 
numerical simulation results. Future research questions and conclusions are discussed in Sections 9 
and 10, respectively. 


2 Problem Formulation 

Notation. For any two vectors $x, y \in \mathbb{R}^n$, the standard inner product is indicated as $\langle x, y \rangle$, while its induced (Euclidean) norm is represented as $\|x\|_2$. A vector $x$ belongs to $\mathbb{R}^n_+$ iff it is of size $n$ and all its components are nonnegative (i.e., $\mathbb{R}^n_+$ is the nonnegative orthant). For any vector $x \in \mathbb{R}^n$, its components are indicated by $x_i$, $i \in \{1, \dots, n\}$. The vector $1_n$ is the column vector of length $n$ containing only ones. We indicate by $I_n$ the identity matrix of size $n$. For any real-valued square matrix $X \in \mathbb{R}^{n \times n}$, we say $X \succeq 0$ or $X \preceq 0$ iff the matrix is positive semi-definite or negative semi-definite, respectively. We also write $X \in \mathbb{S}^n_+$ iff $X \succeq 0$. For any real-valued square matrix $X \in \mathbb{R}^{n \times n}$, the norm $\|X\|_F$ represents the Frobenius norm, while the trace is indicated by $\mathrm{tr}[X]$. The symbol $(\cdot)^T$ is the transpose operator, $\otimes$ represents the Kronecker product, $\circ$ stands for map composition, $\mathrm{conv}[\cdot]$ is the convex hull, $\mathrm{vec}(\cdot)$ is the vectorization operator, while $P_X[\cdot]$ is the projection operator onto the set $X$. The $\epsilon$-subgradient of a concave function $q(x) : X \subset \mathbb{R}^n \to \mathbb{R}$, for the non-negative scalar $\epsilon \ge 0$, at $x' \in X$ is a vector $g \in \mathbb{R}^n$ such that

$$\langle g, y - x' \rangle \ge q(y) - q(x') - \epsilon, \quad \forall y \in X. \tag{1}$$

Furthermore, the collection of $\epsilon$-subgradients of $q(x)$ at $x'$ is called the $\epsilon$-subdifferential set, denoted by $\partial_\epsilon q_X(x')$. If $\epsilon = 0$ the $\epsilon$-subgradient is the regular subgradient and we drop the $\epsilon$ in the notation of the subdifferential.
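As a small numerical illustration of inequality (1), the following sketch checks the condition on a grid of test points for a scalar concave function. The function $q(x) = -x^2$ on $X = [-1, 1]$, the point $x' = 0.5$, the perturbation $0.2$, and the helper name `is_eps_subgradient` are all assumptions made for this example; they do not come from the paper. A grid check of this kind can only suggest, not prove, that a vector is an $\epsilon$-subgradient.

```python
def is_eps_subgradient(q, g, x_prime, eps, points, tol=1e-12):
    """Check inequality (1) for a scalar concave q on a finite set of test points."""
    return all(g * (y - x_prime) >= q(y) - q(x_prime) - eps - tol for y in points)


def q(x):
    # Illustrative concave function on X = [-1, 1]; its derivative is -2x.
    return -x * x


grid = [-1.0 + 2.0 * i / 2000 for i in range(2001)]  # finite sample of X
x_prime = 0.5
exact = -2.0 * x_prime      # regular subgradient at x' (eps = 0)
perturbed = exact + 0.2     # the same subgradient shifted by a small amount

exact_ok = is_eps_subgradient(q, exact, x_prime, 0.0, grid)
perturbed_fails = not is_eps_subgradient(q, perturbed, x_prime, 0.0, grid)
perturbed_ok = is_eps_subgradient(q, perturbed, x_prime, 0.02, grid)
```

In this toy case the perturbed vector is no longer a regular subgradient, yet it still satisfies (1) once a small $\epsilon$ is allowed, which is the sense in which inexact subgradient information can be tolerated.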

Formulation. We consider a convex optimization problem defined on a network of computing and communicating nodes. Let the nodes be labeled with $i \in V = \{1, \dots, n\}$ and we equip each of them with the local (private) convex function $f_i(x_i) : \mathbb{R} \to \mathbb{R}$. Let $x$ be the stacked vector of all the local decision variables, i.e., $x = (x_1, \dots, x_n)^T$. Let the functions $g_i(x_i) : \mathbb{R} \to \mathbb{R}$, $i \in V$, be convex. Let $A_0, A_i$, $i \in V$, be $d \times d$ real-valued square and symmetric matrices. Let $X_i \subset \mathbb{R}$, $i \in V$, be convex and compact sets, and let $X := \prod_{i \in V} X_i$. We are interested in solving decomposable convex optimization problems of the form

$$\underset{x_i \in X_i,\, i \in V}{\text{minimize}} \quad f(x) := \sum_{i \in V} f_i(x_i) \tag{2a}$$

$$\text{subject to} \quad \sum_{i \in V} g_i(x_i) \le 0, \tag{2b}$$

$$\phantom{\text{subject to}} \quad A_0 + \sum_{i \in V} A_i x_i \succeq 0. \tag{2c}$$

In order to simplify our notation (and without loss of generality) we have chosen to work with scalar decision variables $x_i$, with one scalar inequality, and with one linear matrix inequality. The following assumptions are in place.

Assumption 2.1 (Convexity and compactness) The cost functions $f_i(x_i)$ and the constraint functions $g_i(x_i)$ are convex in $x_i$ for each $i$. The sets $X_i$ are convex and compact (thus, bounded). The matrices $A_0, A_i$, $i \in V$, are real-valued square and symmetric.

Assumption 2.2 (Existence of solution) The feasible set $F := \{x \in X \mid \text{(2b) and (2c)}\}$ is nonempty; for all $x \in F$ the cost function $f(x) > -\infty$, and there exists a vector $x \in F$ such that $f(x) < \infty$.

Assumption 2.3 (Slater condition) There exists a vector $\bar{x} \in \mathbb{R}^n$ that is strictly feasible for problem (2), i.e.,

$$\sum_{i \in V} g_i(\bar{x}_i) < 0, \quad \text{and} \quad A_0 + \sum_{i \in V} A_i \bar{x}_i \succ 0.$$

Assumption 2.4 (Communication network) The computing nodes communicate synchronously via undirected time-invariant communication links.

Assumption 2.1 is required to ensure a convex program with compact feasible set. Assumption 2.2
ensures the existence of a solution for the optimization problem (2). Let $x^*$ be such a (possibly not
unique) solution (i.e., a minimizer) and let $f^*$ be the unique minimum. Assumption 2.3 is often required
in dual decomposition approaches in order to guarantee zero duality gap and to be able to derive
the optimal value of the optimization problem (2) by solving its dual. In addition, the Slater condition
helps in bounding the dual variables, which is crucial in our convergence analysis. Assumption 2.4
is required to simplify the convergence analysis. One might be able to loosen it and require only
asynchronous communications, but this is left for future research since it is not the core idea of this
paper. By Assumption 2.4, we can define an undirected communication graph $\mathcal{G}$ consisting of a vertex
set $V$ as well as an edge set $E$. For each node $i$, we call neighborhood, or $N_i$, the set of the nodes it
can communicate with.

The main research problem we tackle in this paper can be stated as follows. 

Research problem: we would like to devise an algorithm that enables each node, by communicating with
its neighbors only, to construct a sequence of approximate local optimizers $\{\hat{x}_i^k\}$, for which the primal
objective sequence $\{f(\hat{x}^k)\}$ eventually converges to $f^*$ (possibly) up to a bounded error floor.

Our approach towards this problem is to devise a consensus-based dual decomposition with approximate
primal recovery.

Sample applications. Problems such as (2) appear in many contexts. The first example we cite is the
network utility maximization (NUM) problem, where a group of communication nodes try to maximize
their utility subject to a resource allocation constraint [23, 24]. NUM problems are very relevant
in communication systems. Generalizations of NUM problems, where the cost function is separable
and the variables are constrained by linear inequalities, can also be handled by (2), and have been
considered, e.g., in model predictive controller design [25] (which is one of the workhorses of modern
control theory). Another sample application is sensor selection, where a set of nodes try to decide
which of them should be activated to perform a certain task based on a given metric. This
is in general a combinatorial problem, yet it can be relaxed to a semidefinite program, which is a
generalization of (2) [26, 27]. In the latter example, the constraint (2c) plays an important role.

Multi-agent/Multiuser/Networked problems. If the constraints (2b) and (2c) involve only local
functions, that is, the sum is only over the neighbors of a particular $i$, then we have what is known as a
multi-agent (or multiuser, or networked) problem. These problems can be further complicated by the
presence of global decision variables. In all these cases, due to the presence of neighborhood constraint
functions only, the dual variables associated with them can be computed locally in the neighborhood
(we can refer to them as link dual variables). Therefore, by a proper use of dual decomposition, we can
devise distributed algorithms that can be implemented on nodes and connecting links. Relevant recent
work on these problems is reported in [28-35]. In our case, the constraints (2b)-(2c) involve constraint
functions from all the nodes, in all the decision variables together; therefore, the proposed methods
for multi-agent problems cannot be directly applied in our case. In general, the link dual variables
become a network-wide dual variable in our case, and we retrieve the standard dual decomposition
scheme with the need for a master node to compute such a global network-wide dual variable.

3 Dual Decomposition

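The Lagrangian function $L(x, \mu, G) : \mathbb{R}^n \times \mathbb{R}_+ \times \mathbb{S}^d_+ \to \mathbb{R}$ is formed, as a first step of dual decomposition,

$$L(x, \mu, G) := \sum_{i \in V} f_i(x_i) + \mu \Big( \sum_{i \in V} g_i(x_i) \Big) - \mathrm{tr}\Big[ \Big( A_0 + \sum_{i \in V} A_i x_i \Big) G \Big], \tag{3}$$

where $\mu \in \mathbb{R}_+$ is the dual variable associated with the constraint (2b), and $G \in \mathbb{S}^d_+$ is the dual variable associated with (2c). Further, the dual function $q(\mu, G) : \mathbb{R}_+ \times \mathbb{S}^d_+ \to \mathbb{R}$ can be defined as

$$q(\mu, G) := \min_{x \in X} \{ L(x, \mu, G) \}. \tag{4}$$

The set $X$ is compact, which means that the function $q(\mu, G)$ is continuous on $\mathbb{R}_+ \times \mathbb{S}^d_+$. Furthermore, the function $q(\mu, G)$ is concave. For any pair of dual variables $(\mu, G)$, we can compute the value of the primal minimizers and their set:

$$\tilde{x} := \arg\min_{x \in X} \{ L(x, \mu, G) \}, \qquad \tilde{X} := \{ x \in X \mid q(\mu, G) = L(x, \mu, G) \}. \tag{5}$$

Given the compactness of $X$ and the form of the dual function (4), we can define the subdifferential of $q(\mu, G)$ at $\mu$ and $G$ as the following sets

$$\partial q_\mu(\mu, G) := \mathrm{conv}\Big\{ \sum_{i \in V} g_i(x_i) \;\Big|\; x \in \tilde{X} \Big\}, \tag{6a}$$

$$\partial q_G(\mu, G) := \mathrm{conv}\Big\{ -A_0 - \sum_{i \in V} A_i x_i \;\Big|\; x \in \tilde{X} \Big\}. \tag{6b}$$

Subgradient choices for $q(\mu, G)$ are therefore

$$h(\tilde{x}) := \sum_{i \in V} g_i(\tilde{x}_i), \qquad Q(\tilde{x}) := -A_0 - \sum_{i \in V} A_i \tilde{x}_i, \tag{7}$$

for any choice of $\tilde{x} \in \tilde{X}$. In addition, since $X$ is compact and the constraints (2b)-(2c) are represented by continuous functions, the subgradients are bounded, and we set, for all $i \in V$,

$$\|h_i(x)\|_2 \le \max_{x_i \in X_i} \|g_i(x_i)\|_2 =: L, \qquad \|Q_i(x)\|_F \le \max_{x_i \in X_i} \|-A_0/n - A_i x_i\|_F =: Q, \tag{8}$$

where we have defined $h_i(x) := g_i(x_i)$ and $Q_i(x) := -A_0/n - A_i x_i$. Finally, the Lagrangian dual problem can be written as

$$q^* := \sup_{\mu \in \mathbb{R}_+,\, G \in \mathbb{S}^d_+} \{ q(\mu, G) \}, \tag{9}$$

and by the Slater condition (Assumption 2.3), strong duality holds: $q^* = f^*$.

Since the original convex optimization problem (2) is decomposable, the Lagrangian function is separable as

$$L(x, \mu, G) = \sum_{i \in V} \Big( f_i(x_i) + \mu g_i(x_i) - \mathrm{tr}\big[ (A_0/n + A_i x_i) G \big] \Big) =: \sum_{i \in V} L_i(x_i, \mu, G), \tag{10}$$

and so is the dual function

$$q(\mu, G) := \sum_{i \in V} \min_{x_i \in X_i} \{ L_i(x_i, \mu, G) \} =: \sum_{i \in V} q_i(\mu, G), \tag{11}$$

and its subgradients.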

Dual decomposition with approximate primal recovery as defined in [4] is summarized in the
following algorithm.

Dual decomposition with primal recovery

1. Initialize $\mu^0 \in \mathbb{R}_+$, $G^0 \in \mathbb{S}^d_+$, choose a constant stepsize $\alpha$;

2. Local dual optimization: compute in parallel the local dual functions and their primal optimizers

$$q_i(\mu^k, G^k) = \min_{x_i \in X_i} \{ L_i(x_i, \mu^k, G^k) \}, \qquad \tilde{x}_i^k = \arg\min_{x_i \in X_i} \{ L_i(x_i, \mu^k, G^k) \}, \tag{12a}$$

as well as their subgradients $g_i(\tilde{x}_i^k)$ and $-A_0/n - A_i \tilde{x}_i^k$;

3. Primal recovery step: compute in parallel the ergodic sum, for $k \ge 1$,

$$\hat{x}_i^k = \frac{1}{k} \sum_{t=1}^{k} \tilde{x}_i^t; \tag{12b}$$

4. Update the dual variables $\mu^k, G^k$ as

$$\mu^{k+1} = P_{\mathbb{R}_+}\Big[ \mu^k + \alpha \sum_{i \in V} g_i(\tilde{x}_i^k) \Big], \tag{12c}$$

$$G^{k+1} = P_{\mathbb{S}^d_+}\Big[ G^k - \alpha \Big( A_0 + \sum_{i \in V} A_i \tilde{x}_i^k \Big) \Big]. \tag{12d}$$

This algorithm generates a converging sequence $\{\hat{x}^k\}$ as detailed in the following theorem.
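The four steps above can be sketched on a toy instance. The sketch below is only an illustration under simplifying assumptions that are not in the paper: the local costs are $f_i(x_i) = (x_i - c_i)^2$, the only coupling constraint is the scalar one $\sum_i x_i \le b$ (written as $g_i(x_i) = x_i - b/n$), the linear matrix inequality (2c) is omitted, and $X_i = [0, 1]$. The names `dual_decomposition` and `clip` are hypothetical.

```python
def clip(v, lo, hi):
    """Projection of a scalar onto the interval [lo, hi]."""
    return max(lo, min(hi, v))


def dual_decomposition(c, b, alpha, iters, lo=0.0, hi=1.0):
    """Toy instance: minimize sum_i (x_i - c_i)^2  s.t.  sum_i x_i <= b,  x_i in [lo, hi]."""
    n = len(c)
    mu = 0.0                        # Step 1: initialize the dual variable
    running_sum = [0.0] * n
    x_hat = [0.0] * n
    for k in range(1, iters + 1):
        # Step 2: each node minimizes (x_i - c_i)^2 + mu * (x_i - b/n) over [lo, hi].
        x_tilde = [clip(ci - mu / 2.0, lo, hi) for ci in c]
        # Step 3: ergodic mean of the local minimizers seen so far.
        running_sum = [s + x for s, x in zip(running_sum, x_tilde)]
        x_hat = [s / k for s in running_sum]
        # Step 4: projected subgradient ascent on mu (projection onto mu >= 0).
        subgradient = sum(x - b / n for x in x_tilde)
        mu = max(0.0, mu + alpha * subgradient)
    return mu, x_hat


mu, x_hat = dual_decomposition(c=[1.0, 0.8, 0.6], b=1.2, alpha=0.1, iters=2000)
```

For this instance the optimal multiplier is $0.8$ and the primal optimizer is $(0.6, 0.4, 0.2)$, so the returned `mu` and `x_hat` can be compared against these values. Note that Step 4 needs the sum of all local subgradients, which is exactly the centralized operation the rest of the paper aims to remove.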

Theorem 3.1 Let the sequence $\{\mu^k, G^k, \hat{x}^k\}$ be generated by the iterations in (12). Let $L$ and $Q$ be defined as in (8). Under Assumptions 2.1 to 2.3,

(a) the dual variables are bounded, i.e., $\|\mu^k\|_2 \le \Gamma_\mu < \infty$, $\|G^k\|_F \le \Gamma_G < \infty$, for all $k \ge 1$;


(b) an upper bound on the primal cost of the vector $\hat{x}^k$, $k \ge 1$, is given by

^ + an\L^^ + Q^) _ 

(c) a lower bound on the primal cost of the vector $\hat{x}^k$, $k \ge 1$, is given by


/(*'=) 

Proof The proof follows from [4, Lemma 3 and Proposition 1]. Since our optimization problem also involves a linear matrix inequality, some extra steps are needed in the proof of part (c). To be more specific, by following the same steps as in the proof of [4, Proposition 1.(c)], we arrive at the following inequality

$$f(\hat{x}^k) \ge f^* - \mu^* h(\hat{x}^k) - \mathrm{tr}[Q(\hat{x}^k) G^*], \tag{13}$$

where $\mu^* \ge 0$ and $G^* \succeq 0$ are the optimal dual variables. We now need to find a lower bound for the rightmost term of (13). By arguments similar to those in the proof of [4, Proposition 1.(a)], we obtain for all $k \ge 1$

$$h(\hat{x}^k) \le \frac{\mu^k}{\alpha k}, \qquad Q(\hat{x}^k) \preceq \frac{G^k}{\alpha k}. \tag{14}$$

Given two positive semi-definite matrices $X$ and $Y$ of dimension $n \times n$, we know that $\mathrm{tr}[XY] \ge \lambda_{\min}(X)\,\mathrm{tr}[Y] \ge 0$ [36, Lemma 1], which means

$$\mathrm{tr}\Big[ \Big( \frac{G^k}{\alpha k} - Q(\hat{x}^k) \Big) G^* \Big] \ge 0, \quad \text{thus} \quad \mathrm{tr}\Big[ \frac{G^k}{\alpha k} G^* \Big] \ge \mathrm{tr}[Q(\hat{x}^k) G^*].$$

This implies that for $k \ge 1$

tr[Q(a:'")G*] ^ = krT^G*]! ^||g''||f||G*||f ^ (15) 

Lafc J I Lafc Jl afc ak 

where we have used Cauchy-Schwarz inequality [37]. By combining (15) and (14) with (13), we obtain 
the lower bound 

fix’^) p.*h{x’^) - ti[Q{x^)G*] > /* - 4 - 


and the claim is proven.

Although the dual decomposition method of [4] presents several advantages, in practice the nodes
will need to sum the subgradients coming from the whole network in Step 4 in order to maintain
common dual variables. This is often not practical in large networks, because it would call for a
significant communication overhead.

In the following sections, (i) we propose a consensus-based dual decomposition with primal recovery
mechanism to modify Step 4 in order to make it suitable for limited information exchange (i.e.,
communication only with neighboring nodes); (ii) we prove dual and primal objective convergence
of the proposed method up to a bounded error floor which depends (among other things) on the
number of communication exchanges with the neighboring nodes for each iteration $k$.


4 Basic Relations 

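Lemma 4.1 Suppose Assumptions 2.1 to 2.3 hold. Let $\hat{\mu} \ge 0$, $\hat{G} \succeq 0$ be a pair of dual variables for which the set $D := \{(\mu \ge 0, G \succeq 0) \mid q(\mu, G) \ge q(\hat{\mu}, \hat{G})\}$ is nonempty. Then, the set $D$ is bounded and we have

$$\max_{(\mu, G) \in D} \|\mu\|_2 + \|G\|_F \le \frac{1}{\gamma} \big( f(\bar{x}) - q(\hat{\mu}, \hat{G}) \big),$$

where $\gamma := \min\big\{ -\sum_{i \in V} g_i(\bar{x}_i),\; \lambda_{\min}\big(A_0 + \sum_{i \in V} A_i \bar{x}_i\big) \big\}$, $\lambda_{\min}(\cdot)$ is the smallest eigenvalue, and $\bar{x}$ is a vector satisfying the Slater condition.

Proof The lemma follows from [4, Lemma 1] with minor modifications. In particular, we use [36, Lemma 1] to bound the inner product

$$\mathrm{tr}\Big[ \Big( A_0 + \sum_{i \in V} A_i \bar{x}_i \Big) G \Big] \ge \lambda_{\min}\Big( A_0 + \sum_{i \in V} A_i \bar{x}_i \Big) \mathrm{tr}[G],$$

and the fact that $\|G\|_F \le \mathrm{tr}[G]$ [37]. The remaining steps are omitted since they are similar to [4, Lemma 1].

□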


It follows from the result of the preceding lemma that under Slater, the dual optimal set $D^*$ is nonempty. Since $D^* := \{(\mu \ge 0, G \succeq 0) \mid q(\mu, G) \ge q^*\}$, by using Lemma 4.1, we obtain

$$\max_{(\mu^*, G^*) \in D^*} \|\mu^*\|_2 + \|G^*\|_F \le \frac{1}{\gamma} \big( f(\bar{x}) - q^* \big).$$

Furthermore, although the dual optimal value $q^*$ is not a priori available, one can compute a looser bound by computing the dual function for some pair $(\hat{\mu} \ge 0, \hat{G} \succeq 0)$. Owing to optimality, $q^* \ge q(\hat{\mu}, \hat{G})$, thus

$$\max_{(\mu^*, G^*) \in D^*} \|\mu^*\|_2 + \|G^*\|_F \le \frac{1}{\gamma} \big( f(\bar{x}) - q(\hat{\mu}, \hat{G}) \big).$$

This result is quite useful to render the dual decomposition method easier to study. In fact, as in [4], we can modify the sets over which we project in Step 4 by considering a bounded superset of the dual optimal solution set. This means that we can substitute Step 4 in (12) with

$$\mu^{k+1} = P_{D_\mu}\Big[ \mu^k + \alpha \sum_{i \in V} g_i(\tilde{x}_i^k) \Big], \qquad D_\mu := \Big\{ \mu \ge 0 \;\Big|\; \|\mu\|_2 \le \frac{f(\bar{x}) - q(\hat{\mu}, \hat{G})}{\gamma} + r \Big\}, \tag{16a}$$

$$G^{k+1} = P_{D_G}\Big[ G^k - \alpha \Big( A_0 + \sum_{i \in V} A_i \tilde{x}_i^k \Big) \Big], \qquad D_G := \Big\{ G \succeq 0 \;\Big|\; \|G\|_F \le \frac{f(\bar{x}) - q(\hat{\mu}, \hat{G})}{\gamma} + r \Big\}, \tag{16b}$$

for a given scalar $r > 0$. The nice feature of this modification is that both $D_\mu$ and $D_G$ are now compact convex sets. This does not increase computational complexity, and it is a useful modification, for it provides leverage to derive the convergence rate results. In the following, for convergence purposes, we will use $r \ge \big( f(\bar{x}) - q(\hat{\mu}, \hat{G}) \big) / \gamma$.


5 Consensus-Based Dual Decomposition


We now consider a consensus-based update to adapt the update rule of dual decomposition in (16) to the constraint of a limited communication network. Our approach is inspired by the one of [18] but applied to the dual domain. First of all, we define a consensus matrix $W \in \mathbb{R}^{n \times n}$ with the following properties:

$$[W]_{ij} = 0 \ \text{if} \ j \notin N_i \cup \{i\}, \qquad W = W^T, \qquad W 1_n = 1_n, \qquad \rho\Big[ W - \frac{1}{n} 1_n 1_n^T \Big] \le \nu < 1, \tag{17}$$

where $\rho[\cdot]$ returns the spectral radius and $\nu$ is an upper bound on the value of the spectral radius. It is common practice to generate such consensus matrices; a possible choice is the Metropolis-Hastings weighting matrix [38, 39].
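A minimal sketch of one such construction is given below. It uses the common Metropolis-Hastings rule $[W]_{ij} = 1/(1 + \max\{d_i, d_j\})$ for neighboring nodes, with $d_i$ the degree of node $i$, and sets the diagonal so that each row sums to one; the paper only cites [38, 39] for this choice, so the exact rule, the four-node path graph, and the function names `metropolis_weights` and `consensus` are assumptions of this example.

```python
def metropolis_weights(n, edges):
    """Metropolis-Hastings weights for an undirected graph given as (i, j) pairs."""
    degree = [0] * n
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    W = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        w = 1.0 / (1 + max(degree[i], degree[j]))
        W[i][j] = w
        W[j][i] = w                      # symmetric by construction
    for i in range(n):
        W[i][i] = 1.0 - sum(W[i])        # diagonal is still 0 here, so rows sum to 1
    return W


def consensus(W, x, phi):
    """Apply the linear map x -> W x a total of phi times."""
    n = len(x)
    for _ in range(phi):
        x = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x


W = metropolis_weights(4, [(0, 1), (1, 2), (2, 3)])   # path graph 0-1-2-3
x0 = [4.0, 0.0, 0.0, 0.0]                             # mean value is 1.0
x50 = consensus(W, x0, 50)
```

Each multiplication by `W` only combines values held by neighboring nodes, and after enough repetitions every entry of the vector approaches the mean of the initial values.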

A consensus iteration is a linear mapping $C(x) : x \mapsto Wx$ with the property that the result of its repeated application converges to the mean of the initial vector, i.e., for $x \in \mathbb{R}^n$

$$\lim_{\varphi \to \infty} \underbrace{C \circ C \circ \cdots \circ C}_{\varphi \text{ times}}(x) = \lim_{\varphi \to \infty} W^\varphi x = \frac{1_n 1_n^T}{n} x.$$

This averaging property is ensured, for example, by conditions such as the ones in (17). In addition, given the structure of $W$ in (17), each consensus iteration involves only local communications (only the neighboring nodes will share their local variables), which will be the key point of our modification. In the following, we will study multiple consensus steps, in the sense that the computing nodes will run multiple consensus iterations (each of which involves only local communications) between subsequent iterations $k$. We let the number of consensus steps be $\varphi \in \mathbb{N}$. In this case, the consensus mapping will be of the form $x \mapsto W^\varphi x$. Since we will enable each node to generate its own dual variables on which consensus will be enforced, we start by defining local versions of $\mu$ and $G$ as $\mu_i \in \mathbb{R}_+$ and $G_i \in \mathbb{S}^d_+$, respectively. Next, we define our consensus-based dual decomposition as the following algorithm.


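Consensus-based dual decomposition with primal recovery (CoBa-DD)

1. Initialize $\mu_i^0 \in \mathbb{R}_+$, $G_i^0 \in \mathbb{S}^d_+$, $i \in V$, choose $\alpha > 0$, determine a Slater vector $\bar{x}$ and the sets $D_\mu$ and $D_G$ of (16) with an arbitrarily picked $\hat{\mu}, \hat{G}$ and a scalar $r \ge \big( f(\bar{x}) - q(\hat{\mu}, \hat{G}) \big)/\gamma$; choose the number of consensus steps $\varphi$;

2. Local dual optimization: compute in parallel the local dual functions and their primal optimizers

$$q_i(\mu_i^k, G_i^k) = \min_{x_i \in X_i} \{ L_i(x_i, \mu_i^k, G_i^k) \}, \qquad \tilde{x}_i^k = \arg\min_{x_i \in X_i} \{ L_i(x_i, \mu_i^k, G_i^k) \}, \tag{18a}$$

as well as their subgradients $g_i(\tilde{x}_i^k)$ and $-A_0/n - A_i \tilde{x}_i^k$;

3. Primal recovery step: compute in parallel the ergodic sum, for $k \ge 1$,

$$\hat{x}_i^k = \frac{1}{k} \sum_{t=1}^{k} \tilde{x}_i^t; \tag{18b}$$

4. Update the dual variables $\mu_i^k, G_i^k$ as

$$\mu_i^{k+1} = P_{D_\mu}\Big[ \sum_{j \in V} [W^\varphi]_{ij} \big( \mu_j^k + \alpha g_j(\tilde{x}_j^k) \big) \Big], \tag{18c}$$

$$G_i^{k+1} = P_{D_G}\Big[ \sum_{j \in V} [W^\varphi]_{ij} \big( G_j^k - \alpha (A_0/n + A_j \tilde{x}_j^k) \big) \Big]. \tag{18d}$$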

We highlight that the proposed algorithm CoBa-DD (or (18)) involves only local communication.
The only communication involved is in the $\varphi$ consensus steps, each of which requires the nodes
to share information with their neighbors. Also, note that computing $\big( f(\bar{x}) - q(\hat{\mu}, \hat{G}) \big)/\gamma$ (for the
definition of $D_\mu$ and $D_G$) is not a very difficult task, since a Slater vector is usually easy to find by
inspection, and both $f(\bar{x})$ and $\gamma$ can be computed by a consensus algorithm run in the initialization
step of CoBa-DD.
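To make the steps of CoBa-DD concrete, the following sketch runs them on a toy instance. Everything specific here is an assumption made for illustration and is not taken from the paper: the local costs are $f_i(x_i) = (x_i - c_i)^2$, the only coupling constraint is $\sum_i x_i \le b$ (so $g_i(x_i) = x_i - b/n$ and the matrix variable $G_i$ is dropped), $X_i = [0, 1]$, the network is a three-node path with Metropolis-Hastings weights, the Slater vector is $\bar{x} = 0$ with $\hat{\mu} = 0$, and the names `coba_dd`, `consensus`, and `clip` are hypothetical.

```python
def clip(v, lo, hi):
    """Projection of a scalar onto the interval [lo, hi]."""
    return max(lo, min(hi, v))


def consensus(W, v, phi):
    """Apply v -> W v a total of phi times (each step uses neighbor values only)."""
    n = len(v)
    for _ in range(phi):
        v = [sum(W[i][j] * v[j] for j in range(n)) for i in range(n)]
    return v


def coba_dd(c, b, W, alpha, phi, mu_max, iters, lo=0.0, hi=1.0):
    """Toy instance: minimize sum_i (x_i - c_i)^2  s.t.  sum_i x_i <= b,  x_i in [lo, hi]."""
    n = len(c)
    mu = [0.0] * n                  # Step 1: one local dual variable per node
    running_sum = [0.0] * n
    x_hat = [0.0] * n
    for k in range(1, iters + 1):
        # Step 2: node i minimizes its local Lagrangian using its own mu_i.
        x_tilde = [clip(c[i] - mu[i] / 2.0, lo, hi) for i in range(n)]
        # Step 3: ergodic mean of the local minimizers.
        running_sum = [s + x for s, x in zip(running_sum, x_tilde)]
        x_hat = [s / k for s in running_sum]
        # Step 4: local subgradient step, phi consensus rounds, then projection.
        local = [mu[i] + alpha * (x_tilde[i] - b / n) for i in range(n)]
        mixed = consensus(W, local, phi)
        mu = [clip(m, 0.0, mu_max) for m in mixed]
    return mu, x_hat


W = [[2 / 3, 1 / 3, 0.0], [1 / 3, 1 / 3, 1 / 3], [0.0, 1 / 3, 2 / 3]]  # 3-node path
c, b = [1.0, 0.8, 0.6], 1.2
gamma = b                                              # -sum_i g_i(0) = b for x_bar = 0
mu_max = sum(ci * ci for ci in c) / gamma + 1.0        # (f(x_bar) - q(0)) / gamma + r, r = 1
mu, x_hat = coba_dd(c, b, W, alpha=0.3, phi=10, mu_max=mu_max, iters=2000)
```

In this instance the centralized optimum is $\mu^* = 0.8$ with primal solution $(0.6, 0.4, 0.2)$; the local multipliers end up close to, but not exactly at, a common value, which is the kind of residual disagreement the analysis below quantifies. Because the consensus step averages the local updates, the stepsize acting on the summed subgradient is effectively $\alpha/n$ in this sketch.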

In order to analyze dual and primal convergence of (18), we start with some basic results. First, given that the sets $D_\mu$ and $D_G$ are compact, and that $\mu_i^0$ and $G_i^0$ are picked to be bounded, the dual variables $\mu_i^k$ and $G_i^k$ are bounded for each $k \ge 0$. In particular, we have

$$\|\mu_i^k\|_2 \le \Lambda < \infty, \qquad \|G_i^k\|_F \le \Gamma < \infty. \tag{19}$$

Lemma 5.1 Let $q(x) : X \to \mathbb{R}$ be a concave function. Let the set $X \subset \mathbb{R}^n$ be convex and compact, and in particular $\max_{x \in X} \|x\|_2 \le \eta$. There exist two finite scalars $\zeta > 0$ and $\tau > 0$ such that, for all $x \in X$, for all $g(x) \in \partial q_X(x)$, and for all vectors $v \in \mathbb{R}^n$ with $\|v\|_2 \le \tau$, the following holds

$$g(x) + v \in \partial_\zeta q_X(x).$$

Proof The claim is proven by using the definition of subgradient of a concave function (1). Since $q$ is a concave function, for all $x, y \in X$, $v \in \mathbb{R}^n$,

$$q(y) - q(x) \le \langle g(x), y - x \rangle = \langle g(x) + v, y - x \rangle - \langle v, y - x \rangle \le \langle g(x) + v, y - x \rangle + \|v\|_2 \|y - x\|_2 \le \langle g(x) + v, y - x \rangle + 2\tau\eta.$$

For $\tau \le \zeta/(2\eta)$, the claim follows. □

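Lemma 5.2 Let the initial dual variables in (18), $\mu_i^0$ and $G_i^0$ for all $i \in V$, be bounded. Let $W$ satisfy the conditions (17). Then, the following quantity is bounded by a certain $c_0 > 0$,

$$\Big\| \sum_{j \in V} \Big[ W^\varphi - \frac{1}{n} 1_n 1_n^T \Big]_{ij} \big( \mu_j^0 + \alpha g_j(\tilde{x}_j^0) \big) \Big\|_2 + \Big\| \sum_{j \in V} \Big[ W^\varphi - \frac{1}{n} 1_n 1_n^T \Big]_{ij} \big( G_j^0 - \alpha (A_0/n + A_j \tilde{x}_j^0) \big) \Big\|_F \le c_0, \quad \forall i \in V. \tag{20}$$

Proof The proof follows given the compactness of $X$ and (therefore) the boundedness of the subgradients. □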

We now present the main convergence results. 


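Theorem 5.1 (Dual variable agreement) Let $\bar{\mu}^k, \bar{G}^k$ be the mean values of the dual variables generated via the algorithm (18), i.e.,

$$\bar{\mu}^k = \frac{1}{n} \sum_{i \in V} \mu_i^k, \qquad \bar{G}^k = \frac{1}{n} \sum_{i \in V} G_i^k.$$

Let Assumptions 2.1 to 2.3 hold and let $W$ satisfy the conditions (17). Let $\mu_i^0$ and $G_i^0$ for $i \in V$ be bounded and let $\beta_0 \ge c_0$, with $c_0$ defined as in (20). Define $L$ and $Q$ as in (8) and let

$$M := L + Q, \qquad \rho := \frac{\nu^\delta \beta_0}{\beta_0 + \alpha M}.$$

There exists a number of consensus iterations $\bar{\varphi} \ge 1$, such that if $\varphi \ge \bar{\varphi} + \delta$, $\delta \ge 0$, then for $k \ge 1$ the dual variables reach consensus as

$$\|\mu_i^{k+1} - \bar{\mu}^{k+1}\|_2 \le 2\rho^{k-1} \nu^\delta \beta_0 + 2\rho\alpha M \frac{1 - \rho^{k-1}}{1 - \rho}, \quad \forall i \in V,$$

$$\|G_i^{k+1} - \bar{G}^{k+1}\|_F \le 2\rho^{k-1} \nu^\delta \beta_0 + 2\rho\alpha M \frac{1 - \rho^{k-1}}{1 - \rho}, \quad \forall i \in V.$$

Furthermore,

$$\bar{\varphi} = \frac{\log(\beta_0) - \log\big(4n(1 + d^2)(\beta_0 + \alpha M)\big)}{\log(\nu)}.$$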

Corollary 5.1 Under the same conditions of Theorem 5.1, we obtain

$$\lim_{k \to \infty} \|\mu_i^k - \bar{\mu}^k\|_2 \le \frac{2\rho\alpha M}{1 - \rho}, \qquad \lim_{k \to \infty} \|G_i^k - \bar{G}^k\|_F \le \frac{2\rho\alpha M}{1 - \rho}, \quad \forall i \in V.$$

Theorem 5.1 and Corollary 5.1 specify how the consensus is reached among the nodes on the
value of the dual variables while the algorithm (18) is running. Specifically, the consensus is reached
exponentially fast up to a steady-state bounded error floor. This bounded error depends on $\alpha$ (which
can be tuned), and on $\rho$, which can also be tuned by varying $\varphi$. In particular, for $\varphi \to \infty$, due to
the fact that $\nu < 1$ in conditions (17), then $\rho = 0$ and we obtain back the usual dual decomposition
scheme with perfect agreement among the nodes.
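The dependence of the agreement bound on $\alpha$ and on the extra consensus steps $\delta$ can be explored numerically. The sketch below evaluates the bound of Theorem 5.1 and the limit of Corollary 5.1, reading $\rho$ as $\nu^\delta \beta_0 / (\beta_0 + \alpha M)$, which is how the recursion in the proof of Theorem 5.1 suggests it should be read. The function names and the parameter values are assumptions chosen only for illustration.

```python
def contraction_factor(nu, delta, beta0, alpha, M):
    """rho = nu^delta * beta0 / (beta0 + alpha * M), assumed reading of Theorem 5.1."""
    return nu ** delta * beta0 / (beta0 + alpha * M)


def disagreement_bound(k, nu, delta, beta0, alpha, M):
    """Upper bound on the distance of a local dual variable from the network mean at k+1."""
    rho = contraction_factor(nu, delta, beta0, alpha, M)
    transient = 2.0 * rho ** (k - 1) * nu ** delta * beta0
    steady = 2.0 * rho * alpha * M * (1.0 - rho ** (k - 1)) / (1.0 - rho)
    return transient + steady


def disagreement_floor(nu, delta, beta0, alpha, M):
    """Limit of the bound for k -> infinity (Corollary 5.1)."""
    rho = contraction_factor(nu, delta, beta0, alpha, M)
    return 2.0 * rho * alpha * M / (1.0 - rho)


params = dict(nu=0.8, beta0=1.0, alpha=0.1, M=2.0)   # illustrative values only
floor_2 = disagreement_floor(delta=2, **params)
floor_5 = disagreement_floor(delta=5, **params)
late_bound = disagreement_bound(200, delta=2, **params)
```

With these illustrative values, adding consensus steps (larger $\delta$) lowers the floor, and the bound at a late iteration is numerically indistinguishable from the floor, reflecting the geometric decay of the transient term.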


Remark 5.1 Computing the lower bound on the number of consensus steps $\bar{\varphi}$ can be done during the
initialization of the algorithm. We can always pick $\beta_0$ big enough so that $\beta_0 \gg \alpha M$, which means that
$\bar{\varphi}$ can be simplified as $\bar{\varphi} = \log\big(1/(4n(1 + d^2))\big)/\log(\nu)$, which can be determined in a distributed way [40].
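As a rough illustration of the remark, the sketch below evaluates the expression for $\bar{\varphi}$ from Theorem 5.1 and its simplified form, rounding up to an integer number of consensus steps. The rounding, the lower limit of one step, the function names, and the numerical values are assumptions of this example rather than prescriptions from the paper.

```python
import math


def min_consensus_steps(n, d, nu, beta0, alpha, M):
    """Smallest integer phi with 4 * nu^phi * n * (1 + d^2) * (beta0 + alpha*M) <= beta0."""
    phi_bar = (math.log(beta0)
               - math.log(4 * n * (1 + d * d) * (beta0 + alpha * M))) / math.log(nu)
    return max(1, math.ceil(phi_bar))


def simplified_consensus_steps(n, d, nu):
    """Approximation valid when beta0 is much larger than alpha * M."""
    return max(1, math.ceil(math.log(1.0 / (4 * n * (1 + d * d))) / math.log(nu)))


n, d, nu, beta0, alpha, M = 10, 2, 0.8, 10.0, 0.01, 1.0   # illustrative values only
phi = min_consensus_steps(n, d, nu, beta0, alpha, M)
phi_simple = simplified_consensus_steps(n, d, nu)
```

The simplified expression depends only on the network size $n$, the matrix dimension $d$, and the bound $\nu$, so it can be evaluated before the stepsize is fixed; it never exceeds the full expression because $(\beta_0 + \alpha M)/\beta_0 \ge 1$.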


Theorem 5.2 (Dual objective convergence) Let $\{\mu_i^k, G_i^k\}$ be the dual variables generated via the algorithm (18). Let $\mu_i^0$ and $G_i^0$ for all $i \in V$ be bounded and let $\beta_0$ be defined as in Theorem 5.1. Define $L$ and $Q$ as in (8) and let $M := L + Q$. Choose a scalar $\tau$ such that $\beta_0/\alpha \le \tau$. Let $\zeta$ be defined as in Lemma 5.1 for the concave function $q(\mu, G)$ and the choice of $\tau$. Let $q^*$ be the optimal value of $q(\mu, G)$. Let Assumptions 2.1 to 2.3 hold and let $W$ satisfy the conditions (17). Let $\varphi \ge \bar{\varphi} + \delta$, $\delta \ge 0$, and let $\bar{\varphi}$ be defined as in Theorem 5.1. The following holds true.

If $q^* = \infty$, then

$$\limsup_{k \to \infty} q(\mu_i^k, G_i^k) = \infty, \quad \forall i \in V.$$

If $q^* < \infty$, then

$$\limsup_{k \to \infty} q(\mu_i^k, G_i^k) \ge q^* - \alpha n (M + \tau)^2/2 - n\big(\beta_\infty (9M + 3\tau) + \zeta\big), \quad \forall i \in V,$$

with $\beta_\infty = \dfrac{\rho\alpha M}{1 - \rho}$ and $\rho = \dfrac{\nu^\delta \beta_0}{\beta_0 + \alpha M}$.

Theorem 5.2 implies dual objective convergence up to a bounded error floor. Convergence is even more evident if we remember that, owing to optimality, $q(\mu_i^k, G_i^k) \le q^*$, and thus, if we define $q_i^\infty := \limsup_{k \to \infty} q(\mu_i^k, G_i^k)$, we obtain

$$q_i^\infty - q^* \ge -\alpha n (M + \tau)^2/2 - n\big(\beta_\infty (9M + 3\tau) + \zeta\big) =: -\epsilon_d.$$

Note that the rightmost term ($-\epsilon_d$) represents a measure of sub-optimality of the approximate solution.


Theorem 5.3 (Primal objective convergence) Let $\{\mu_i^k, G_i^k, \hat{x}_i^k\}$ be the dual and primal variables generated via the algorithm (18). Let $\mu_i^0$ and $G_i^0$ for all $i \in V$ be bounded and let $\beta_0$ be defined as in Theorem 5.1. Define $L$ and $Q$ as in (8), $\Lambda$ and $\Gamma$ as in (19), and let $M := L + Q$. Choose a scalar $\tau$ such that $\beta_0/\alpha \le \tau$. Let $\zeta$ be defined as in Lemma 5.1 for the concave function $q(\mu, G)$ and the choice of $\tau$. Let $f^*$ be the optimal value of $f(x)$. Let Assumptions 2.1 to 2.3 hold and let $W$ satisfy the conditions (17). Let $\varphi \ge \bar{\varphi} + \delta$, $\delta \ge 0$, and let $\bar{\varphi}$ be defined as in Theorem 5.1. The following holds true.

(a) An upper bound on the primal cost of the vector $\hat{x}^k$, $k \ge 1$, is given by


fix'^) ^f + 


2kafn 


+ fife! 


(b) A lower bound on the primal cost of the vector $\hat{x}^k$, $k \ge 1$, is given by


f(x^) ^ f* 


9{A^ + r^) 

2kafn 


where

$$\epsilon_p := \frac{\alpha n (M + \tau)^2}{2} + n\tau(\Lambda + \Gamma) + n\big(\beta_0 (6M + 3\tau) + \zeta\big).$$

Theorem 5.3 formulates convergence of the primal cost up to an error bound $\epsilon_p$. The rate of convergence is $O(1/k)$. We can also distinguish the error terms that come from the constant stepsize $\alpha$ and the terms that come from the finite number of consensus steps $\varphi$. In particular, we can write

$$\epsilon_p = \underbrace{\frac{\alpha n M^2}{2} + \frac{\alpha n (2M\tau + \tau^2)}{2}}_{(1)} + \underbrace{n\tau(\Lambda + \Gamma) + n\big(\beta_0 (6M + 3\tau) + \zeta\big)}_{(2)},$$

and see that the term (1) is due to the constant stepsize, while the term (2) is due to the finite number of consensus steps. Furthermore, if $\varphi \to \infty$, then $c_0 = 0$, and we can set $\beta_0 = \tau = \zeta = 0$, yielding

$$\lim_{\varphi \to \infty} \epsilon_p = \frac{\alpha n M^2}{2}.$$

This is similar to the error level we obtain for the dual decomposition method in (12) and Theorem 3.1. Theorem 5.3 defines the main trade-offs in designing the algorithm's parameters $\alpha$ and $\varphi$. The larger the stepsize $\alpha$ is, the faster the convergence is, even though the steady-state error becomes larger. If we increase $\varphi$, then the communication effort increases and the error decreases.


6 Proof of Theorem 5.1 and Theorem 5.2 

6.1 Preliminaries 

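We start our analysis by rewriting Step 4 of (18) in a more compact way. Let $z_i \in \mathbb{R}^{1+d^2}$ be the vector defined as $z_i := (\mu_i, \mathrm{vec}(G_i)^T)^T$, and let $z_{sv}$ be the stacked vector of all the $z_i$, $i \in V$. Similarly, let $h_i(x)$ be the vector $h_i(x) := (g_i(x_i), \mathrm{vec}(-A_0/n - A_i x_i)^T)^T$, and let $h_{sv}(x)$ be the stacked vector of all the $h_i(x)$, $i \in V$. Let $Z$ be the convex set

$$Z := \{ z := (\mu, \mathrm{vec}(G)^T)^T \in \mathbb{R}^{1+d^2} \mid \mu \in D_\mu,\ G \in D_G \}, \tag{21}$$

and let $Z_{sv} = \prod_{i=1}^{n} Z$. The iterations in Step 4 of (18) can be rewritten as

$$z_{sv}^{k+1} = P_{Z_{sv}}\big[ (W^\varphi \otimes I_{1+d^2}) \big( z_{sv}^k + \alpha h_{sv}(\tilde{x}^k) \big) \big]. \tag{22}$$

The iteration (22) represents a consensus-based subgradient method to maximize the dual function $q(\mu, G)$, i.e., the maximization problem

$$q^* := \max_{\mu \in D_\mu,\, G \in D_G} \sum_{i \in V} q_i(\mu, G) = \max_{z \in Z} \sum_{i \in V} q_i(z), \quad \text{for } z = (\mu, \mathrm{vec}(G)^T)^T.$$

In particular (22) assigns to each node a copy of $z$, $z_i$, and enforces consensus among them. Furthermore, by (8), by the triangle inequality, and by (19),

$$\|h_i(x)\|_2 \le \|g_i(x_i)\|_2 + \|Q_i(x)\|_F \le L + Q = M, \qquad \|h_{sv}(x)\|_2 \le nM, \tag{23a}$$

$$\max_{z \in Z} \|z\|_2 \le \sqrt{\Lambda^2 + \Gamma^2} \le \Lambda + \Gamma. \tag{23b}$$

Lemma 6.1 ([18, Lemma 1]) Let $x_i \in \mathbb{R}^m$, $i \in V$, be $m$-dimensional vectors. Let $\bar{x}$ be the average value of $x_i$, $i \in V$, i.e., $\bar{x} = \frac{1}{n} \sum_{i \in V} x_i$. The following basic relations hold,

(a) if $\|x_i - x_j\|_2 \le \beta$, $\forall i, j \in V$, then $\|x_i - \bar{x}\|_2 \le \frac{n-1}{n}\beta$;

(b) if $\|x_i - \bar{x}\|_2 \le \beta$, $\forall i \in V$, then $\|x_i - x_j\|_2 \le 2\beta$.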

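Lemma 6.2 ([18, Lemma 2]) Let $x^k \in \mathbb{R}^n$ be an $n$-dimensional vector, with components $x_i^k \in \mathbb{R}$, $i = 1, \dots, n$. Let $x^{k+1} = W^\varphi x^k$, with $W \in \mathbb{R}^{n \times n}$ fulfilling conditions (17). Let $\|x_i^k - x_j^k\|_2 \le \sigma$, for a bounded $\sigma$, and for all $i, j = 1, \dots, n$. Then $\|x_i^{k+1} - x_j^{k+1}\|_2 \le 2\nu^\varphi n \sigma$ for all $i, j = 1, \dots, n$.

Lemma 6.3 Let $\{z_{sv}^k\}$ be generated by (22) under Assumptions 2.1 to 2.3. Let $v_i^k \in \mathbb{R}^{1+d^2}$, for all $i \in V$, be defined as

$$v_i^k = \sum_{j \in V} [W^\varphi]_{ij} \big( z_j^k + \alpha h_j(\tilde{x}^k) \big),$$

and let $\bar{v}^k$ be the average value of $v_i^k$, $i \in V$, i.e., $\bar{v}^k = \frac{1}{n} \sum_{i \in V} v_i^k$. There exists a $\bar{\varphi} \ge 1$, such that if $\varphi \ge \bar{\varphi} + \delta$ with $\delta \ge 0$, then

$$\|v_i^k - \bar{v}^k\|_2 \le \beta,\ \forall i \in V \implies \|v_i^{k+1} - \bar{v}^{k+1}\|_2 \le \beta,\ \forall i \in V,\ k \ge 0.$$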

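Proof The proof is an adaptation of [18, Lemma 3]. In particular, we can show that, for all $i,j \in V$,

$$\|v_i^k - \bar{v}^k\|_2 \le \beta,\ \forall i \in V \implies \|v_i^{k+1} - v_j^{k+1}\|_2 \le 4\nu^{\varphi} n (1+d^2)(\beta + \alpha M). \tag{24}$$

Therefore, if we choose

$$\bar\varphi \ge \frac{\log(\beta) - \log(4n(1+d^2)(\beta + \alpha M))}{\log(\nu)},$$

then

$$\|v_i^k - \bar{v}^k\|_2 \le \beta,\ \forall i \in V \implies \|v_i^{k+1} - v_j^{k+1}\|_2 \le \nu^{\delta}\beta,\ \forall i,j \in V,$$

and the claim follows from Lemma 6.1.(a). In order to prove (24), we proceed as follows:

$$\|v_i^k - \bar{v}^k\|_2 \le \beta,\ \forall i \in V \ \overset{\text{Lemma 6.1}}{\implies}\ \|v_i^k - v_j^k\|_2 \le 2\beta,\ \forall i,j \in V \implies \big|[v_i^k - v_j^k]_\ell\big| \le 2\beta,\ \forall i,j \in V,\ \ell = 1, \dots, 1+d^2,$$

where $[\cdot]_\ell$ extracts the $\ell$-th component of a vector. Define

$$u_i^{k+1} = \mathsf{P}_Z[v_i^k] + \alpha h_i(x^{k+1}), \quad \forall i \in V.$$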

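Prior to consensus, the distance between the iterates can be bounded as

$$\|u_i^{k+1} - u_j^{k+1}\|_2 = \|\mathsf{P}_Z[v_i^k] + \alpha h_i(x^{k+1}) - \mathsf{P}_Z[v_j^k] - \alpha h_j(x^{k+1})\|_2 \le \|\mathsf{P}_Z[v_i^k] - \mathsf{P}_Z[v_j^k]\|_2 + 2\alpha M \le \|v_i^k - v_j^k\|_2 + 2\alpha M \le 2(\beta + \alpha M),$$

which also implies $\big|[u_i^{k+1} - u_j^{k+1}]_\ell\big| \le 2(\beta + \alpha M)$. Given that $z_i^{k+1} = \mathsf{P}_Z[v_i^k]$, $\forall i$, after consensus, we have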


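$$\|v_i^{k+1} - v_j^{k+1}\|_2 = \Big\|\sum_{p \in V} [W^{\varphi}]_{ip}\, u_p^{k+1} - \sum_{p \in V} [W^{\varphi}]_{jp}\, u_p^{k+1}\Big\|_2 \le \sum_{\ell=1}^{1+d^2} \Big|\sum_{p \in V} [W^{\varphi}]_{ip} [u_p^{k+1}]_\ell - \sum_{p \in V} [W^{\varphi}]_{jp} [u_p^{k+1}]_\ell\Big| = \sum_{\ell=1}^{1+d^2} \Big|[W^{\varphi} \tilde{u}_\ell^{k+1}]_i - [W^{\varphi} \tilde{u}_\ell^{k+1}]_j\Big|, \tag{25}$$

where $\tilde{u}_\ell^{k+1} = ([u_1^{k+1}]_\ell, \dots, [u_n^{k+1}]_\ell)^\top$. As said, $\big|[u_i^{k+1} - u_j^{k+1}]_\ell\big| \le 2(\beta + \alpha M)$, which means $\big|[\tilde{u}_\ell^{k+1}]_i - [\tilde{u}_\ell^{k+1}]_j\big| \le 2(\beta + \alpha M)$. Thus, by using Lemma 6.2 we can bound (25) as

$$\|v_i^{k+1} - v_j^{k+1}\|_2 \le \sum_{\ell=1}^{1+d^2} \Big|[W^{\varphi} \tilde{u}_\ell^{k+1}]_i - [W^{\varphi} \tilde{u}_\ell^{k+1}]_j\Big| \le 4\nu^{\varphi} n (1+d^2)(\beta + \alpha M),$$

which is the rightmost term in (24) and the claim is proven.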
6.2 Proof of Theorem 5.1

The quantity $\|v_i^0 - \bar{v}^0\|_2$ is upper bounded by $\beta_0$ by Lemma 5.2 (inequality (20)), thus $\|v_i^0 - \bar{v}^0\|_2 \le \beta_0$. Let us choose $\varphi \ge \bar\varphi + \delta$, $\delta \ge 0$, with $\bar\varphi$ determined as in Theorem 5.1. Then, by Lemma 6.3 and (24), it follows that


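$$\|v_i^0 - \bar{v}^0\|_2 \le \beta_0 \implies \|v_i^1 - \bar{v}^1\|_2 \le \nu^{\delta}\beta_0,$$

$$\|v_i^1 - \bar{v}^1\|_2 \le \nu^{\delta}\beta_0 \implies \|v_i^2 - v_j^2\|_2 \le 4\nu^{\varphi} n (1+d^2)(\nu^{\delta}\beta_0 + \alpha M) \le \nu^{\delta}\beta_0\, \frac{\nu^{\delta}\beta_0 + \alpha M}{\beta_0 + \alpha M},$$

and, by Lemma 6.1.(a), the same bound holds for $\|v_i^2 - \bar{v}^2\|_2$. Repeating the argument for $k = 3, 4, \dots$ yields

$$\|v_i^k - \bar{v}^k\|_2 \le \Big(\frac{\nu^{\delta}\beta_0}{\beta_0 + \alpha M}\Big)^{k-1} \nu^{\delta}\beta_0 + \alpha M \Big(-1 + \sum_{t=0}^{k-1} \Big(\frac{\nu^{\delta}\beta_0}{\beta_0 + \alpha M}\Big)^{t}\Big).$$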


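Let $\rho := \frac{\nu^{\delta}\beta_0}{\beta_0 + \alpha M}$; since $\rho < 1$, then

$$\|v_i^k - \bar{v}^k\|_2 \le \rho^{k-1}\nu^{\delta}\beta_0 + \rho\,\alpha M\, \frac{1 - \rho^{k-1}}{1 - \rho} =: \beta_k, \quad k \ge 1, \tag{26}$$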


and by Lemma 6.1.(b), we derive $\|v_i^k - v_j^k\|_2 \le 2\beta_k$. By using the non-expansive property of the projection operator, since $z_i^{k+1} = \mathsf{P}_Z[v_i^k]$ for all $i$, we can write

$$\|z_i^{k+1} - z_j^{k+1}\|_2 \le \|v_i^k - v_j^k\|_2 \le 2\beta_k, \quad k \ge 1, \tag{27}$$

and by Lemma 6.1.(a) the claim follows.
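The closed form of $\beta_k$ can be cross-checked against the recursion it comes from. The sketch below assumes the reading $\beta_1 = \nu^{\delta}\beta_0$, $\beta_{k+1} = \rho(\beta_k + \alpha M)$ with $\rho = \nu^{\delta}\beta_0/(\beta_0 + \alpha M)$, which is how the chain of inequalities in this proof is interpreted here; the function names and the numerical values used to exercise them are purely illustrative.

```python
def contraction_rho(beta0, alpha, M, nu, delta):
    """rho = nu^delta * beta0 / (beta0 + alpha * M), as assumed from the proof."""
    return nu ** delta * beta0 / (beta0 + alpha * M)

def beta_recursive(beta0, alpha, M, nu, delta, K):
    """Return [beta_1, ..., beta_K] from beta_{k+1} = rho * (beta_k + alpha * M)."""
    rho = contraction_rho(beta0, alpha, M, nu, delta)
    b = nu ** delta * beta0
    out = [b]
    for _ in range(K - 1):
        b = rho * (b + alpha * M)
        out.append(b)
    return out

def beta_closed_form(beta0, alpha, M, nu, delta, k):
    """Closed form of beta_k for k >= 1 (geometric sum of the recursion)."""
    rho = contraction_rho(beta0, alpha, M, nu, delta)
    return rho ** (k - 1) * nu ** delta * beta0 + rho * alpha * M * (1 - rho ** (k - 1)) / (1 - rho)
```

Both functions agree term by term, and the sequence settles at $\rho\,\alpha M/(1-\rho)$ as $k$ grows.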


6.3 Proof of Theorem 5.2 

We define an average value for $z_i^k$ as $\bar{z}^k = \frac{1}{n}\sum_{i \in V} z_i^k$. For convergence purposes, we need to keep track of the difference $\bar{z}^{k+1} - \mathsf{P}_Z[\bar{v}^k]$, and thus we define the vectors $y^k$ and $d^k$ as

$$y^k := \mathsf{P}_Z[\bar{v}^{k-1}], \qquad d^k := \bar{z}^k - y^k, \qquad k \ge 1. \tag{28}$$

The main idea of the proof is to show that $y^k$ is updated via an approximate $\epsilon$-subgradient method; the theorem then follows by using [41, Proposition 4.1]. The first part is formalized in the following lemma.

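Lemma 6.4 Let $y^k$ be defined as in (28). Under the same conditions of Theorem 5.2, for all $k \ge 1$:

(a) The quantity $\|d^k/\alpha\|_2$ is upper bounded by $\beta_{k-1}/\alpha \le \tau$ (where $\beta_k$ is defined in (26));

(b) The following inequalities are true, for all $i \in V$:

$$q(y^k) \le q(z_i^k) + 3nM\beta_{k-1}, \tag{29}$$

$$q_i(y) \le q_i(y^k) + \langle h_i(x^k) + \nu,\, y - y^k \rangle + \epsilon_k/n, \quad \forall y \in Z; \tag{30}$$

(c) The quantity $g(x^k) := \sum_{i \in V} (h_i(x^k) + d^k/\alpha)$ is an $\epsilon_k$-subgradient of $q$ at $y^k$ with respect to $y$;

(d) The variable $y^k$ is updated via an $\epsilon$-subgradient method,

$$y^{k+1} = \mathsf{P}_Z\big[y^k + \tfrac{\alpha}{n}\, g(x^k)\big]. \tag{31}$$

Here, $\epsilon_k = n(\beta_{k-1}(6M + 3\tau) + \zeta)$.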
Proof

(a) We start by bounding $\|d^k\|_2$,

$$\|d^k\|_2 = \Big\|\frac{1}{n}\sum_{i \in V} \big(\mathsf{P}_Z[v_i^{k-1}] - \mathsf{P}_Z[\bar{v}^{k-1}]\big)\Big\|_2 \le \frac{1}{n}\sum_{i \in V} \|v_i^{k-1} - \bar{v}^{k-1}\|_2 \le \beta_{k-1},$$

where we have used the inequality (26) to bound the term $\|v_i^{k-1} - \bar{v}^{k-1}\|_2$.

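(b) Since $y^k \in Z$ and $z_i^k \in Z$, by the concavity of $q_j(z)$ and the definition of subgradient of a concave function (1), we can write for all $i,j \in V$

$$q_j(y^k) \le q_j(z_i^k) + \langle h,\, y^k - z_i^k \rangle, \quad \text{where } h \in \partial q_{j,z}(z_i^k),$$
$$\le q_j(z_i^k) + \|h\|_2 \|y^k - z_i^k\|_2 \le q_j(z_i^k) + M\big(\|z_i^k - \bar{z}^k\|_2 + \|d^k\|_2\big)$$
$$\le q_j(z_i^k) + M(2\beta_{k-1} + \beta_{k-1}) = q_j(z_i^k) + 3M\beta_{k-1}.$$

In particular, we have used the fact that any subgradient vector of $q_j(z)$ is bounded by $M$ (23a), and inequality (27). If we sum the last relation over $j \in V$, we obtain (29). Exchanging the roles of $y^k$ and $z_i^k$ in the same argument also gives $q_i(z_i^k) \le q_i(y^k) + 3M\beta_{k-1}$. In addition, for any $y \in Z$, by using Lemma 5.1,

$$q_i(y) \le q_i(z_i^k) + \langle h_i(x^k),\, y - z_i^k \rangle \le q_i(z_i^k) + \langle h_i(x^k) + \nu,\, y - z_i^k \rangle + \zeta$$
$$\le q_i(y^k) + \langle h_i(x^k) + \nu,\, y - z_i^k \rangle + 3M\beta_{k-1} + \zeta$$
$$= q_i(y^k) + \langle h_i(x^k) + \nu,\, y - y^k + y^k - z_i^k \rangle + 3M\beta_{k-1} + \zeta$$
$$\le q_i(y^k) + \langle h_i(x^k) + \nu,\, y - y^k \rangle + \|h_i(x^k) + \nu\|_2 \|y^k - z_i^k\|_2 + 3M\beta_{k-1} + \zeta.$$

We use the fact that $\|\nu\|_2 \le \tau$ by construction in Lemma 5.1, $\|h_i(x^k)\|_2 \le M$ by (23a), $\|z_i^k - \bar{z}^k\|_2 \le 2\beta_{k-1}$ by (27), and $\|d^k\|_2 \le \beta_{k-1}$ by the preceding proof. By using these inequalities, we can bound

$$\|h_i(x^k) + \nu\|_2 \le M + \tau, \qquad \|y^k - z_i^k\|_2 = \|z_i^k - \bar{z}^k + d^k\|_2 \le 3\beta_{k-1},$$

and we obtain

$$q_i(y) \le q_i(y^k) + \langle h_i(x^k) + \nu,\, y - y^k \rangle + (\beta_{k-1}(6M + 3\tau) + \zeta),$$

which is (30).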

(c) By using the definition of subdifferential (1), the inequality (30) implies $(h_i(x^k) + \nu) \in \partial_{\epsilon_k/n}\, q_{i,y}(y^k)$ with $\epsilon_k/n = \beta_{k-1}(6M + 3\tau) + \zeta$. Summation over $i$ yields

$$q(y) \le q(y^k) + \Big\langle \sum_{i \in V} (h_i(x^k) + \nu),\, y - y^k \Big\rangle + n(\beta_{k-1}(6M + 3\tau) + \zeta),$$

for any $\nu$ such that $\|\nu\|_2 \le \tau$. Since $\|d^k/\alpha\|_2 \le \tau$ by construction, we can choose $\nu = d^k/\alpha$, from which the claim follows.

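(d) It is sufficient to write explicitly the update rule for $y^k$. Starting from the definition of $y^{k+1}$ in (28) and the definition of $v_i^k$ in Lemma 6.3, we obtain

$$y^{k+1} = \mathsf{P}_Z[\bar{v}^k] = \mathsf{P}_Z\Big[\frac{1}{n}\sum_{i \in V} v_i^k\Big] = \mathsf{P}_Z\Big[\frac{1}{n}\sum_{i \in V}\sum_{j \in V} [W^{\varphi}]_{ij}\,(z_j^k + \alpha h_j(x^k))\Big]$$
$$= \mathsf{P}_Z\Big[\frac{1}{n}\sum_{i \in V} (z_i^k + \alpha h_i(x^k))\Big] = \mathsf{P}_Z\Big[y^k + d^k + \frac{\alpha}{n}\sum_{i \in V} h_i(x^k)\Big] = \mathsf{P}_Z\Big[y^k + \frac{\alpha}{n}\, g(x^k)\Big].$$

Given part (c) of this Lemma, the claim follows. □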
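The algebra in part (d) rests on the fact that averaging over the nodes removes the consensus matrix when its columns sum to one. A minimal numerical sketch of that identity follows; the ring matrix `ring_weights` and the random data are assumptions made for illustration, and the projection is left out because it is applied to the same argument on both sides.

```python
import numpy as np

def ring_weights(n):
    """Symmetric, doubly stochastic weights on a ring (assumed stand-in for W)."""
    W = 0.5 * np.eye(n)
    for i in range(n):
        W[i, (i + 1) % n] += 0.25
        W[i, (i - 1) % n] += 0.25
    return W

def averaged_consensus_output(z, h, W, phi, alpha):
    """Average over nodes of v_i = sum_j [W^phi]_ij (z_j + alpha h_j)."""
    v = np.linalg.matrix_power(W, phi) @ (z + alpha * h)
    return v.mean(axis=0)

def eps_subgradient_argument(z, h, y, alpha):
    """y + (alpha/n) g with g = sum_i (h_i + d/alpha) and d = zbar - y."""
    n = z.shape[0]
    d = z.mean(axis=0) - y
    g = (h + d / alpha).sum(axis=0)
    return y + (alpha / n) * g
```

The two functions return the same vector for any reference point `y`, since the correction term $d^k$ absorbs the gap between `y` and the node average.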
Proof (of Theorem 5.2) By Lemma 6.4, the sequence $\{y^k\}$ is generated via an $\epsilon$-subgradient algorithm to maximize $q(y)$. In particular, for $k \ge 1$,

$$y^{k+1} = \mathsf{P}_Z\big[y^k + \tfrac{\alpha}{n}\, g(x^k)\big], \qquad \|g(x^k)\|_2 \le n(M + \tau).$$

Therefore, we can use any standard result on the convergence of approximate subgradient algorithms. E.g., by using [41, Proposition 4.1] (with $m = 1$), the following holds for the sequence $\{y^k\}$.

If $q^* = \infty$, then

$$\limsup_{k \to \infty} q(y^k) = \infty.$$

If $q^* < \infty$, then

$$\limsup_{k \to \infty} q(y^k) \ge q^* - \frac{\alpha n (M + \tau)^2}{2} - n(\beta_\infty(6M + 3\tau) + \zeta),$$

where $\beta_\infty = \lim_{k \to \infty} \beta_{k-1}$. Then, from the inequality (29) the claim is proven. □
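As a worked step, the limit $\beta_\infty$ can be made explicit. The expression below assumes that (26) reads $\beta_k = \rho^{k-1}\nu^{\delta}\beta_0 + \rho\,\alpha M\,(1-\rho^{k-1})/(1-\rho)$ with $\rho = \nu^{\delta}\beta_0/(\beta_0 + \alpha M) < 1$; it is a sketch under that reading, not a statement taken from the paper.

```latex
\beta_\infty
  = \lim_{k\to\infty} \beta_{k-1}
  = \frac{\rho\,\alpha M}{1-\rho}
  = \frac{\nu^{\delta}\beta_0\,\alpha M}{(1-\nu^{\delta})\beta_0 + \alpha M},
\qquad
\limsup_{k\to\infty} q(y^k) \;\ge\; q^* - \frac{\alpha n (M+\tau)^2}{2}
  - n\big(\beta_\infty (6M+3\tau) + \zeta\big).
```

Under this reading, for $\nu < 1$ and $\delta > 0$ the term $\beta_\infty$ vanishes as $\alpha \to 0$, so the error floor is driven by the stepsize and by $\zeta$.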


7 Primal Recovery: Proof of Theorem 5.3 

7.1 Some Basic Facts 

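Lemma 7.1 Let $y^k$ be defined as in (28). Under the same assumptions and notation of Theorem 5.2,

(a) For any $y \in Z$,

$$\sum_{t=1}^{k} \langle g(x^t),\, y - y^t \rangle \le \frac{\|y^1 - y\|_2^2}{2\alpha/n} + k\,\frac{\alpha n (M + \tau)^2}{2};$$

(b) For any $y \in Z$,

$$\sum_{t=1}^{k} \langle g(x^t),\, y - y^* \rangle \le \frac{\|y^1 - y\|_2^2}{2\alpha/n} + k\,\frac{\alpha n (M + \tau)^2}{2} + \sum_{t=1}^{k} \epsilon_t,$$

where $\epsilon_t = n(\beta_{t-1}(6M + 3\tau) + \zeta)$.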

Proof We start from the update rule (31). For any $y \in Z$,

$$\|y^{k+1} - y\|_2^2 = \big\|\mathsf{P}_Z\big[y^k + \tfrac{\alpha}{n}\, g(x^k)\big] - \mathsf{P}_Z[y]\big\|_2^2 \le \big\|y^k + \tfrac{\alpha}{n}\, g(x^k) - y\big\|_2^2$$
$$\le \|y^k - y\|_2^2 + \frac{2\alpha}{n} \langle g(x^k),\, y^k - y \rangle + \alpha^2 (M + \tau)^2,$$

where we use the fact that $\|g(x^k)\|_2 = \|\sum_{i \in V} (h_i(x^k) + d^k/\alpha)\|_2 \le n(M + \tau)$. Therefore, for any $y \in Z$,

$$\langle g(x^k),\, y - y^k \rangle \le \frac{\|y^k - y\|_2^2 - \|y^{k+1} - y\|_2^2}{2\alpha/n} + \frac{\alpha n (M + \tau)^2}{2}, \tag{32}$$

and by summing over $k$, part (a) follows. Since $g(x^k)$ is an $\epsilon_k$-subgradient of the dual function $q$ at $y^k$, using the subgradient inequality (1),

$$\langle g(x^k),\, y^k - y^* \rangle \le q(y^k) - q(y^*) + \epsilon_k \le \epsilon_k,$$

where the last inequality comes from the optimality condition $q(y^k) \le q(y^*)$, which is valid for any $y^k \in Z$. In particular, $\epsilon_k$ is defined in Lemma 6.4.(c). We then have

$$\langle g(x^k),\, y - y^* \rangle = \langle g(x^k),\, y - y^k \rangle + \langle g(x^k),\, y^k - y^* \rangle \le \langle g(x^k),\, y - y^k \rangle + \epsilon_k.$$

From the preceding relation and (32), we obtain

$$\langle g(x^k),\, y - y^* \rangle \le \frac{\|y^k - y\|_2^2 - \|y^{k+1} - y\|_2^2}{2\alpha/n} + \frac{\alpha n (M + \tau)^2}{2} + \epsilon_k,$$

and summing over $k$, part (b) follows as well. In particular, we remark that $y^1 = \mathsf{P}_Z[\bar{v}^0]$, which is bounded, since $Z$ is a compact set. □
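The first inequality of this proof is the standard one-step estimate for a projected (sub)gradient step, and it can be checked numerically. In the sketch below the set $Z$ is replaced by a Euclidean ball (`project_ball`), which is an assumption made only to have a simple convex set with an explicit projection; `step_gap` returns the right-hand side minus the left-hand side, which should never be negative when the reference point lies in the set.

```python
import numpy as np

def project_ball(v, radius):
    """Euclidean projection onto the ball of the given radius (stand-in for P_Z)."""
    norm = np.linalg.norm(v)
    return v if norm <= radius else v * (radius / norm)

def step_gap(y_k, y, g, step, radius):
    """RHS minus LHS of ||y_next - y||^2 <= ||y_k - y||^2 + 2 s <g, y_k - y> + s^2 ||g||^2."""
    y_next = project_ball(y_k + step * g, radius)
    lhs = np.linalg.norm(y_next - y) ** 2
    rhs = (np.linalg.norm(y_k - y) ** 2
           + 2.0 * step * (g @ (y_k - y))
           + step ** 2 * (g @ g))
    return float(rhs - lhs)
```

Here `step` plays the role of $\alpha/n$; bounding $\|g\|_2$ by $n(M+\tau)$ in the last term gives the $\alpha^2(M+\tau)^2$ term used in the proof.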
7.2 Proof of Theorem 5.3. (a)


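Proof By convexity of the primal cost $f(x)$ and the definition of $x_i^t$ as a minimizer of the local Lagrangian functions over $x_i \in X_i$, we have, for the ergodic mean $\hat{x}^k$,

$$f(\hat{x}^k) \le \frac{1}{k}\sum_{t=1}^{k} f(x^t) = \frac{1}{k}\sum_{t=1}^{k}\sum_{i \in V} \big(q_i(z_i^t) - \langle h_i(x^t),\, z_i^t \rangle\big). \tag{33}$$

By Lemma 6.4, inequality (30), with $y = z_i^t \in Z$,

$$q_i(z_i^t) - q_i(y^t) \le \langle h_i(x^t),\, z_i^t \rangle + \langle \nu,\, z_i^t \rangle - \langle h_i(x^t) + \nu,\, y^t \rangle + \epsilon_t/n,$$

with $\epsilon_t/n = \beta_{t-1}(6M + 3\tau) + \zeta$. Summing over $i \in V$,

$$\sum_{i \in V} q_i(z_i^t) \le q(y^t) + \sum_{i \in V} \langle h_i(x^t),\, z_i^t \rangle + \sum_{i \in V} \langle \nu,\, z_i^t \rangle - \langle g(x^t),\, y^t \rangle + \epsilon_t,$$

hence,

$$f(\hat{x}^k) \le \frac{1}{k}\sum_{t=1}^{k} \Big(q(y^t) + \sum_{i \in V} \langle \nu,\, z_i^t \rangle - \langle g(x^t),\, y^t \rangle + \epsilon_t\Big). \tag{34}$$

We can use Lemma 7.1.(a) with $y = 0 \in Z$ to upper bound $-\langle g(x^t),\, y^t \rangle$, while we bound $|\langle \nu,\, z_i^t \rangle|$ as $|\langle \nu,\, z_i^t \rangle| \le \tau(\Lambda + \Gamma)$. The latter bound comes from the fact that by construction $\|\nu\|_2 \le \tau$, and $\|z_i^t\|_2 \le \Lambda + \Gamma$ by (23b). With this in place, we can write (34) as

$$f(\hat{x}^k) \le \frac{1}{k}\sum_{t=1}^{k} \big(q(y^t) + n\tau(\Lambda + \Gamma)\big) + \frac{\|y^1\|_2^2}{2k\alpha/n} + \frac{\alpha n (M + \tau)^2}{2} + \frac{1}{k}\sum_{t=1}^{k} \epsilon_t.$$

If we now compute

$$\frac{1}{k}\sum_{t=1}^{k} \epsilon_t = \frac{1}{k}\sum_{t=1}^{k} n(\beta_{t-1}(6M + 3\tau) + \zeta) \le n(\beta_0(6M + 3\tau) + \zeta), \tag{35}$$

and remember that by optimality $q(y^t) \le q^*$, $q^* = f^*$ by strong duality (Assumption 2.3), and $\|y^1\|_2^2 \le \Lambda^2 + \Gamma^2$, then the claim follows. □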


7.3 Proof of Theorem 5.3. (b) 


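Proof Given any dual optimal solution $y^*$, we have

$$f(\hat{x}^k) = \underbrace{f(\hat{x}^k) + \Big\langle y^*,\, \frac{1}{k}\sum_{t=1}^{k} g(x^t) \Big\rangle}_{(a)} - \Big\langle y^*,\, \frac{1}{k}\sum_{t=1}^{k} g(x^t) \Big\rangle. \tag{36}$$

We also know that

$$(a) = f(\hat{x}^k) + \Big\langle y^*,\, \frac{1}{k}\sum_{t=1}^{k}\sum_{i \in V} h_i(x^t) \Big\rangle + \frac{n}{\alpha}\Big\langle y^*,\, \frac{1}{k}\sum_{t=1}^{k} d^t \Big\rangle \ge f(\hat{x}^k) + \Big\langle y^*,\, \sum_{i \in V} h_i(\hat{x}^k) \Big\rangle - n\tau(\Lambda + \Gamma), \tag{37}$$

where we used the fact that $h_i(x)$ is a convex function of $x$ and therefore

$$\frac{1}{k}\sum_{t=1}^{k}\sum_{i \in V} h_i(x^t) \ge \sum_{i \in V} h_i(\hat{x}^k),$$

and the Cauchy-Schwarz inequality to bound

$$\langle y^*,\, d^t/\alpha \rangle \ge -\|y^*\|_2\, \|d^t/\alpha\|_2 \ge -\tau(\Lambda + \Gamma).$$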
Furthermore, by the saddle point property of the Lagrangian function, i.e., for any $x \in X$, $y \in Z$,

$$L(x^*, y) \le L(x^*, y^*) \le L(x, y^*),$$

and the fact that under strong duality (Assumption 2.3) $L(x^*, y^*) = q^* = f^*$, we can write

$$f(\hat{x}^k) + \Big\langle y^*,\, \sum_{i \in V} h_i(\hat{x}^k) \Big\rangle - n\tau(\Lambda + \Gamma) = L(\hat{x}^k, y^*) - n\tau(\Lambda + \Gamma) \ge f^* - n\tau(\Lambda + \Gamma). \tag{38}$$

We can now upper bound $\langle y^*,\, \frac{1}{k}\sum_{t=1}^{k} g(x^t) \rangle$ in (36) as in Lemma 7.1.(b), with $y = 2y^* \in Z$ (by the definition of $Z$). By substituting this bound in (36) and by combining it with (37) and (38), we get

$$f(\hat{x}^k) \ge f^* - n\tau(\Lambda + \Gamma) - \frac{\|y^1 - 2y^*\|_2^2}{2k\alpha/n} - \frac{\alpha n (M + \tau)^2}{2} - \frac{1}{k}\sum_{t=1}^{k} \epsilon_t.$$

From the upper bound (35), and $\|y^1 - 2y^*\|_2^2 \le \|y^1\|_2^2 + 4\|y^1\|_2 \|y^*\|_2 + 4\|y^*\|_2^2$, which can be upper bounded as $9(\Lambda^2 + \Gamma^2)$, the claim follows. □


8 Numerical results 

In this section, we present some numerical results to assess the proposed algorithm for different $\varphi$ values in comparison with the standard dual decomposition. We choose the following simple yet representative sample problem,

$$\underset{x_i \in [0,1],\ i \in \{1, \dots, 100\}}{\text{minimize}} \quad f(x) = -\sum_{i=1}^{33} a_i x_i - \sum_{i=34}^{100} a_i \log(1 + x_i), \qquad \text{subject to} \quad \sum_{i=1}^{100} a_i x_i \le 10,$$

where each $a_i \in [0,1]$ is drawn from a uniform random distribution. This type of problem has been considered, e.g., in network utility maximization contexts [23]. We solve the problem in Matlab with Yalmip and SDPT3 [42, 43], where we also implement the proposed algorithm.^1

For this problem a Slater vector is $x_i = 0$ for all $i$; furthermore $\gamma = 10$, while $q(0)$ is solvable by inspection ($x_i = 1$) and gives (for our realization of $a_i$) r = 8.62. The communication network is randomly selected, and the average number of neighbors is 3.12.

Figure 1 depicts convergence and is in line with our theoretical findings: the error decreases as $O(1/k)$ until it reaches a bounded error floor. This bounded error floor depends on both $\varphi$ and $\alpha$, as captured in Theorem 5.3. We have also plotted the performance of the standard dual decomposition, which (in the absence of a master node) requires reaching complete consensus at each iteration (in theory $\varphi \to \infty$, but we have set $\varphi = 26$, which yields a full $W^{\varphi}$).

Figure 2 shows the relative error with respect to the total number of messages the nodes are exchanging. We can see that, in the absence of a master node, the proposed consensus-based algorithm requires significantly fewer messages than the standard dual decomposition for the same accuracy level (up to a 1% error). This matters in practice, where communication is often the limiting resource.
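To make the sample problem of this section concrete, the following Python sketch runs a plain dual subgradient method with a master node and ergodic-mean primal recovery. It is not the authors' Matlab implementation and not the consensus-based scheme. It assumes the reading of the problem as minimizing $-\sum_{i \le 33} a_i x_i - \sum_{i \ge 34} a_i \log(1+x_i)$ over $x_i \in [0,1]$ subject to $\sum_i a_i x_i \le 10$, and the upper bound `mu_max` on the multiplier, the stepsize, and the iteration count are illustrative choices.

```python
import numpy as np

N_LINEAR = 33    # indices 1..33 carry the linear terms (assumed reading)
BUDGET = 10.0    # right-hand side of the coupling constraint

def local_minimizers(a, mu):
    """Minimizers over [0, 1] of the local Lagrangians for a multiplier mu >= 0."""
    x = np.empty_like(a)
    # (-1 + mu) * a_i * x_i is decreasing in x_i when mu < 1
    x[:N_LINEAR] = 1.0 if mu < 1.0 else 0.0
    # -a_i log(1 + x_i) + mu a_i x_i has its stationary point at x_i = 1/mu - 1
    x[N_LINEAR:] = 1.0 if mu <= 0.5 else np.clip(1.0 / mu - 1.0, 0.0, 1.0)
    return x

def objective(a, x):
    """Primal cost under the assumed reading of the sample problem."""
    return float(-a[:N_LINEAR] @ x[:N_LINEAR] - a[N_LINEAR:] @ np.log1p(x[N_LINEAR:]))

def dual_decomposition(a, alpha=0.01, iters=2000):
    """Projected dual subgradient ascent; returns the ergodic mean of the primal iterates."""
    q0 = objective(a, np.ones_like(a))       # q(0) = min f, attained at x_i = 1
    mu_max = 2.0 * (0.0 - q0) / BUDGET       # assumed bound, using the Slater vector x = 0
    mu, x_sum = 0.0, np.zeros_like(a)
    for _ in range(iters):
        x = local_minimizers(a, mu)
        x_sum += x
        mu = float(np.clip(mu + alpha * (a @ x - BUDGET), 0.0, mu_max))
    return x_sum / iters, mu, mu_max
```

A typical call draws `a = np.random.default_rng(0).uniform(0, 1, 100)` and inspects `a @ x_hat - BUDGET` for the recovered point `x_hat`; the individual iterates oscillate between corner solutions, while their running average is nearly feasible, which is the motivation for ergodic primal recovery.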


9 Future research questions 

Future research encompasses the following points. 

First of all, we have used the ergodic mean to recover the primal solution. The reason is mainly technical: it helps to derive convergence rate results via a telescopic cancellation argument. Other convex combinations have been advocated, e.g., in [12], but the results they offer are typically asymptotic and require vanishing stepsizes. An open question is whether other combinations for primal recovery are possible using constant stepsizes.


^1 The code is available at: http://ens.ewi.tudelft.nl/~asimonetto/NumericalExample.zip.
Fig. 1 Convergence of the proposed algorithm for different choices of stepsize $\alpha$ and number of consensus steps $\varphi$ (horizontal axis: number of iterations $k$).

Fig. 2 Relative error and number of exchanged messages for different choices of stepsize $\alpha$ and number of consensus steps $\varphi$.


Then, in the derivation, we have limited ourselves to objective convergence. It would be relevant to investigate convergence of the ergodic mean to the optimizer set, either in the general convex case or in the strongly convex scenario.

Finally, the bound on $\varphi$, i.e., $\bar\varphi$, has been derived in such a way that we could use $\epsilon$-subgradient arguments in the rest of the convergence proofs. However, it is quite conservative (in fact, in practice, $\varphi$ can be as small as 1, but this is often not captured by the bound in Theorem 5.1). This is due to Lemma 6.2 and the use of the spectral radius as an upper bound. A thorough investigation is left for future research.
10 Conclusions

A consensus-based dual decomposition scheme has been proposed to enable a network of collaborative computing nodes to generate approximate dual and primal solutions of a distributed convex optimization problem. We have proven convergence of the scheme both in the dual and the primal objective senses up to a bounded error floor. The proposed scheme is of theoretical and applied importance since it eliminates the need for a centralized entity (i.e., a master node) to collect the local subgradient information, by distributing this task among the nodes. This need has been a major hurdle in the use of dual decomposition for solving certain classes of distributed optimization problems.